Posted to dev@lucene.apache.org by Matt Chaput <ma...@sidefx.com> on 2007/03/22 10:42:02 UTC
Positions vs. Term Vectors
Hi, another abstract implementation question:
Per Term Position (prox) data vs. Per Doc Term Vectors. Belt and
Suspenders? Can't Term Vectors effectively (performantly) replace
position data for doing phrase matches? Is there another use of
position data that term vectors don't satisfy? Does each have pros
and cons? Or if you were implementing Lucene from scratch, would you
just implement term vectors and forget positions?
Cheers,
Matt
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
Re: Positions vs. Term Vectors
Posted by karl wettin <ka...@gmail.com>.
On 22 Mar 2007, at 10:42, Matt Chaput wrote:
> Per Term Position (prox) data vs. Per Doc Term Vectors. Belt and
> Suspenders? Can't Term Vectors effectively (performantly) replace
> position data for doing phrase matches? Is there another use of
> position data that term vectors don't satisfy? Does each have
> pros and cons? Or if you were implementing Lucene from scratch,
> would you just implement term vectors and forget positions?
Term positions are stored next to the term because that is where
the cursor is positioned while a query is being evaluated (inverted
index access). The term vector is the opposite: it gives access to
the positions based on a document (vector space model). So to answer
your question: as Lucene is an inverted index, term positions cannot
be replaced by term vectors to get the same or better performance at
query time. The term vector is (as I see it) a cached vector space
model.
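
To make the difference in access direction concrete, here is a minimal
sketch in Python. It is a toy model, not Lucene's file format or API:
the function names (build_indexes, phrase_via_postings,
phrase_via_term_vectors) and the whitespace tokenizer are assumptions
made up for illustration. Both structures hold the same positions; the
only difference is whether they are keyed by term or by document.

```python
from collections import defaultdict


def build_indexes(docs):
    """Build both toy structures from a {doc_id: text} mapping."""
    postings = defaultdict(dict)  # term -> {doc_id: [positions]}
    term_vectors = {}             # doc_id -> {term: [positions]}
    for doc_id, text in docs.items():
        vector = defaultdict(list)
        # Naive tokenizer (an assumption): lowercase, split on whitespace.
        for pos, term in enumerate(text.lower().split()):
            vector[term].append(pos)
        term_vectors[doc_id] = dict(vector)
        for term, positions in vector.items():
            postings[term][doc_id] = positions
    return dict(postings), term_vectors


def phrase_in(positions_by_term, terms):
    """True if the terms occur at consecutive positions."""
    starts = positions_by_term.get(terms[0], [])
    return any(
        all(start + i in positions_by_term.get(t, ()) for i, t in enumerate(terms))
        for start in starts
    )


def phrase_via_postings(postings, terms):
    """Term-keyed access: only documents containing every term are touched."""
    candidates = set(postings.get(terms[0], {}))
    for t in terms[1:]:
        candidates &= set(postings.get(t, {}))
    hits = [
        doc_id
        for doc_id in sorted(candidates)
        if phrase_in({t: postings[t][doc_id] for t in terms}, terms)
    ]
    return hits, len(candidates)


def phrase_via_term_vectors(term_vectors, terms):
    """Document-keyed access: every document's vector has to be visited."""
    hits = [d for d in sorted(term_vectors) if phrase_in(term_vectors[d], terms)]
    return hits, len(term_vectors)
```

Both functions return the matching document ids together with the
number of documents they had to examine. With term-keyed postings that
number is bounded by the documents containing all query terms; with
term vectors alone it is always the whole collection, which is the
query-time cost described above.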
--
karl