Posted to java-user@lucene.apache.org by Christopher Tignor <ct...@thinkmap.com> on 2009/10/01 21:45:43 UTC
TermPositions with custom Tokenizer
Hello,
I have created a custom Tokenizer and am trying to set and extract my own
positions for each Token using:
    reusableToken.reinit(word.getWord(), tokenStart, tokenEnd);
Later, when I query my index with a SpanTermQuery, the start() and end()
values don't correspond to these offsets. Instead they seem to reflect the
order in which the token was produced during indexing, e.g.
start: 5
end: 6
for a given token. I realize that these values come from TermPositions,
but how can I get my custom token start and end offsets into
TermPositions so that I can recover them at query time?
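
To make the mismatch concrete, here is a minimal sketch in plain Java with no Lucene dependency. The class PositionVsOffset, the Tok record and the whitespace tokenizer are all hypothetical and only illustrate the distinction: each token carries character offsets (what I pass to reinit) and an ordinal position (what the span start() and end() values appear to report, with end being position + 1). It assumes a position increment of 1 per token.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionVsOffset {

    // Hypothetical token record for illustration; not a Lucene class.
    static final class Tok {
        final String text;
        final int startOffset; // character offset where the token begins
        final int endOffset;   // character offset just past the token
        final int position;    // ordinal position in the token stream

        Tok(String text, int startOffset, int endOffset, int position) {
            this.text = text;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.position = position;
        }
    }

    // Splits on whitespace, recording both offsets and positions.
    static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        int position = -1;
        int i = 0;
        int n = input.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int start = i;
            while (i < n && !Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            position += 1; // assumed position increment of 1 per token
            out.add(new Tok(input.substring(start, i), start, i, position));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumps over the lazy dog";
        for (Tok t : tokenize(text)) {
            // The span-style range is derived from the position alone.
            System.out.println(t.text
                    + " offsets=[" + t.startOffset + "," + t.endOffset + ")"
                    + " span=[" + t.position + "," + (t.position + 1) + ")");
        }
    }
}
```

In this sketch the sixth token, "over", has offsets 26 to 30 but a span-style range of 5 to 6, which is the same shape as the start/end values I am seeing. The offsets play no part in computing the position, so they would have to be stored somewhere else to be recoverable.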
thanks -
C>T>
--
TH!NKMAP
Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385 f.212-285-8999