Posted to solr-user@lucene.apache.org by vrparekh <vr...@gmail.com> on 2013/03/15 10:51:11 UTC

how to get term vector information of specific word/position in field

Hello,

Currently, when we set qt=tvrh&tv.all=true, the response includes every
word in the field's text.

Is there any way to get term vector information for one specific word
only? That is, can I pass the word and get back just the term positions
and frequency for that word?

Also, can I pass a position range, e.g. startPosition=5 and
endPosition=10, so that it returns the terms, positions, and frequencies
of the words that occur between the start and end positions?





--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-get-term-vector-information-of-sepcific-word-position-in-field-tp4047637.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: how to get term vector information of specific word/position in field

Posted by vrparekh <vr...@gmail.com>.
Thanks Chris,






Re: how to get term vector information of specific word/position in field

Posted by Chris Hostetter <ho...@fucit.org>.
: is there any way, if i can get term vector information of specific word
: only, like i can pass the word, and it will just return term position and
: frequency for that word only?
: 
: and also if i can pass the position e.g. startPosition=5 and endPosition=10;
: then it will return terms, positions and frequency of words which
: occur in between start and end position.

I don't think either of these is available out of the box, but you could 
probably modify the code in TermVectorComponent that iterates over terms 
to filter what it adds to the response based on explicitly passed-in 
"term", "startPos" and "endPos" params.

It would not only cut down on the total data being returned, but since you 
can do a seek on a TermsEnum, limiting that way should speed up the 
processing as well.  I don't think you can seek on term positions, 
however, so you'd still have to iterate over all the positions until you 
found the startPos, but bailing out once you reach the endPos may save 
some time as well.
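
To make the idea concrete, here is a rough sketch of the filtering logic 
in Python.  It is not Solr or Lucene code: the term vector is modeled as a 
plain sorted list, filter_term_vector and its term/start_pos/end_pos 
arguments are hypothetical names, and treating the position range as 
inclusive (and counting only the kept positions as the frequency) is an 
assumption.  The binary search stands in for the seek on the terms, and 
the inner loop shows the scan-then-bail-out behavior on positions.

```python
from bisect import bisect_left


def filter_term_vector(terms, term=None, start_pos=None, end_pos=None):
    """Filter a modeled term vector.

    terms: list of (term_text, sorted_positions) tuples, sorted by term_text.
    Returns {term_text: {"tf": kept_count, "positions": kept_positions}}.
    All names here are hypothetical; this only models the logic.
    """
    if term is not None:
        # "Seek" straight to the requested term instead of visiting every term.
        # (term,) sorts just before (term, positions), so bisect_left lands on it.
        i = bisect_left(terms, (term,))
        found = i < len(terms) and terms[i][0] == term
        candidates = terms[i:i + 1] if found else []
    else:
        candidates = terms

    result = {}
    for text, positions in candidates:
        kept = []
        for pos in positions:
            # Positions cannot be seeked in this model, so skip forward linearly.
            if start_pos is not None and pos < start_pos:
                continue
            # Positions are sorted, so nothing after end_pos can match.
            if end_pos is not None and pos > end_pos:
                break
            kept.append(pos)
        if kept:
            result[text] = {"tf": len(kept), "positions": kept}
    return result


# Modeled vector for: "the quick brown fox jumps over the lazy dog the end"
EXAMPLE_TERMS = [
    ("brown", [2]), ("dog", [8]), ("end", [10]), ("fox", [3]),
    ("jumps", [4]), ("lazy", [7]), ("over", [5]), ("quick", [1]),
    ("the", [0, 6, 9]),
]

single = filter_term_vector(EXAMPLE_TERMS, term="the")
ranged = filter_term_vector(EXAMPLE_TERMS, start_pos=5, end_pos=10)
```

With the example data, single holds only the entry for "the", and ranged 
holds only the terms that have at least one position from 5 through 10.  
In the real component the same checks would presumably sit inside the 
loop that walks the terms and their positions.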

If you do go this route, by all means please submit a patch in Jira; it 
could be handy for other TVC users...

https://wiki.apache.org/solr/HowToContribute
https://issues.apache.org/jira/browse/SOLR


-Hoss

Re: how to get term vector information of specific word/position in field

Posted by vrparekh <vr...@gmail.com>.
The requirement might seem odd, but the text field is big, and returning
term vector information for 10 records in the response will slow things
down. I also don't want term vector information for all the words.

Is there any possible solution?


