Posted to java-user@lucene.apache.org by Christopher Tignor <ct...@thinkmap.com> on 2009/10/01 21:45:43 UTC

TermPositions with custom Tokenizer

Hello,

I have created a custom Tokenizer and am trying to set and extract my own
positions for each Token using:

reusableToken.reinit(word.getWord(),tokenStart,tokenEnd);

Later, when querying my index using a SpanTermQuery, the start() and end()
values don't correspond to these offsets but seem to correspond to the order
in which the token was produced during the indexing process, e.g.

start: 5
end: 6

for a given token.  I realize that these values come from TermPositions,
but how can I effectively get my custom token start and end offsets into
TermPositions for recovery?
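
To make the distinction concrete, here is a minimal plain-Java sketch. It
uses no Lucene classes: PositionVsOffset, Tok and tokenize are hypothetical
names invented for illustration, and the whitespace splitting is an
assumption, not my actual Tokenizer. It only shows that a token's ordinal
position in the stream (the kind of value that "start: 5, end: 6" appears to
be) is a different number from its character start and end offsets (the kind
of values passed to reinit above).

```java
import java.util.ArrayList;
import java.util.List;

public class PositionVsOffset {

    // Hypothetical stand-in for a token; NOT a Lucene class.
    public static final class Tok {
        public final String text;
        public final int startOffset; // index of the first character in the input
        public final int endOffset;   // index one past the last character
        public final int position;    // ordinal of the token in the stream (0, 1, 2, ...)

        Tok(String text, int startOffset, int endOffset, int position) {
            this.text = text;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.position = position;
        }
    }

    // Splits on whitespace (an assumption for this sketch) and records both
    // the character offsets and the ordinal position of every token.
    public static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        int position = 0;
        int i = 0;
        int n = input.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int start = i;
            // Consume the token itself.
            while (i < n && !Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            out.add(new Tok(input.substring(start, i), start, i, position));
            position++;
        }
        return out;
    }

    public static void main(String[] args) {
        for (Tok t : tokenize("the quick brown fox")) {
            // A single-token span covers [position, position + 1).
            System.out.println(t.text
                    + " position=[" + t.position + "," + (t.position + 1) + ")"
                    + " offsets=[" + t.startOffset + "," + t.endOffset + ")");
        }
    }
}
```

For the input "the quick brown fox", the token "brown" has position 2 (so a
single-token span would be [2,3)) while its character offsets are [10,15).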

thanks -

C>T>

-- 
TH!NKMAP

Christopher Tignor | Senior Software Architect
155 Spring Street NY, NY 10012
p.212-285-8600 x385 f.212-285-8999