Posted to java-user@lucene.apache.org by "Zeynep P." <zp...@yahoo.com> on 2012/08/14 15:53:01 UTC

pruning package- question about termpositions && skipTo

Hi to all,

In the pruning package, the Javadoc for the
pruneAllPositions(TermPositions termPositions, Term t) method says:

"termPositions - positioned term positions. Implementations MUST NOT advance
this by calling TermPositions methods that advance either the position
pointer (next, skipTo) or term pointer (seek)."

Why is that?

Here is why I need skipTo:

I added a new pruning class with a public void
initPositionsTerm(TermPositions tp, Term t, ScoreDoc[] sdoc) method. I
needed it because my ScoreDoc[] is generated from Lucene's basic results
combined with different external parameters. In initPositionsTerm, instead
of letting the method collect the docs itself as the other classes do, docs
is simply set to sdoc. For example, for a term x, sdoc = {42813, 123472,
22477, 76995, 47086, 106424, 68570, 26708, 49740, 116472}, and the sorted
docs = {22477, 26708, 42813, 47086, ...}. I want to keep only these postings
in my pruned index.

The problem is that when I call pruneAllPositions as it is, only
{22477, 26708, *107377*} are kept. After 28118, super.next() returns false in
PruningTermPositions.next(), so (termPositions.doc() == docs[docsPos].doc)
is never true for doc IDs > 28118. (I have no idea where 107377 comes from;
it is not even in my docs.) However, when I iterate over termPositions in
pruneAllPositions with the code below, it contains all the doc IDs I need.
That is why I wonder why I cannot call skipTo, and why termPositions behaves
this way.

while(termPositions.next())
{
       System.out.println(termPositions.doc() );
} 
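
One possible reading of the Javadoc restriction quoted above, sketched in plain Java without Lucene: the caller owns the cursor and advances it with next(), while the policy only reads the current document and moves its own index into the sorted keep-list. Every name here (PruneLoopSketch, PostingCursor, KeepListPolicy, keepCurrent) is a hypothetical stand-in and not part of the pruning package, and the doc IDs in main are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PruneLoopSketch {

    /** Hypothetical stand-in for a positioned postings cursor; not a Lucene class. */
    static final class PostingCursor {
        private final int[] docs; // ascending doc IDs for one term
        private int pos = -1;

        PostingCursor(int[] docs) { this.docs = docs; }

        boolean next() { return ++pos < docs.length; }

        int doc() { return docs[pos]; }
    }

    /** Hypothetical policy that keeps only postings whose doc ID is in a sorted list. */
    static final class KeepListPolicy {
        private final int[] keep; // ascending doc IDs to keep
        private int keepPos = 0;

        KeepListPolicy(int[] sortedKeep) { this.keep = sortedKeep; }

        /** Reads cursor.doc() only; it never advances the cursor itself. */
        boolean keepCurrent(PostingCursor cursor) {
            int doc = cursor.doc();
            // Advance the policy's own index, not the cursor.
            while (keepPos < keep.length && keep[keepPos] < doc) {
                keepPos++;
            }
            return keepPos < keep.length && keep[keepPos] == doc;
        }
    }

    /** The caller drives the iteration; the policy is consulted once per posting. */
    static List<Integer> prune(int[] postings, int[] sortedKeep) {
        PostingCursor cursor = new PostingCursor(postings);
        KeepListPolicy policy = new KeepListPolicy(sortedKeep);
        List<Integer> kept = new ArrayList<>();
        while (cursor.next()) {
            if (policy.keepCurrent(cursor)) {
                kept.add(cursor.doc());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] postings = {5, 9, 12, 30, 41}; // illustrative doc IDs only
        int[] keep = {9, 30, 77};
        System.out.println(prune(postings, keep));
    }
}
```

If the policy called next() or skipTo itself in this arrangement, the caller's loop would never see the postings that were skipped over, which is presumably why the restriction exists.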

Thanks in advance,
Best Regards



--
View this message in context: http://lucene.472066.n3.nabble.com/pruning-package-question-about-termpositions-skipTo-tp4001160.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: pruning package- question about termpositions && skipTo

Posted by "Zeynep P." <zp...@yahoo.com>.
Hi to all,

I found the problem and the solution. PruningReader uses
super.getSequentialSubReaders(), so the term positions come from a
sub-reader for a single segment. After 28118, super.next() returns false
because indexreader.maxDoc() is equal to 28118 for that segment. In
pruneAllPositions, instead of comparing termPositions.doc() to docid, I
compared in.document(termPositions.doc()).getField("docid").stringValue()
to docid.

This happened because of my custom initPositionsTerm method (public void
initPositionsTerm(TermPositions tp, Term t, *ScoreDoc[] sdoc*)). There is
no problem with the other pruning policies.

DocID     termPositions.doc()
22477     22477
26708     26708
42813     14093
47086     18366
49740     21020
68570     11760
76995     20185
106424    21524
116472    502
123472    1992
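
The two columns differ by a constant within each group of rows, which suggests that each sub-reader numbers its documents from zero and that the top-level ID is the segment's starting offset plus the local ID. Below is a minimal sketch of that arithmetic in plain Java, assuming the DocID column holds top-level doc IDs. The offsets in DOC_BASES are inferred by subtracting the second column from the first and cover only the segments that appear in the table; the class and method names are hypothetical, not Lucene API.

```java
import java.util.Arrays;

final class SegmentDocIds {

    // Segment starting offsets inferred from the table (DocID minus termPositions.doc()).
    // Assumption: a real index may contain further segments that the table does not show.
    static final int[] DOC_BASES = {0, 28720, 56810, 84900, 115970, 121480};

    /** Index of the last listed segment whose starting offset is <= the top-level doc ID. */
    static int segmentOf(int globalDoc) {
        if (globalDoc < 0) {
            throw new IllegalArgumentException("doc ID must be non-negative");
        }
        int idx = Arrays.binarySearch(DOC_BASES, globalDoc);
        // A miss returns -(insertionPoint) - 1; the containing segment is insertionPoint - 1.
        return idx >= 0 ? idx : -idx - 2;
    }

    /** Top-level doc ID to segment-local doc ID. */
    static int toLocal(int globalDoc) {
        return globalDoc - DOC_BASES[segmentOf(globalDoc)];
    }

    /** Segment-local doc ID back to a top-level doc ID. */
    static int toGlobal(int segment, int localDoc) {
        return DOC_BASES[segment] + localDoc;
    }

    public static void main(String[] args) {
        int[] wanted = {22477, 26708, 42813, 47086, 49740,
                        68570, 76995, 106424, 116472, 123472};
        for (int doc : wanted) {
            System.out.println(doc + " -> segment " + segmentOf(doc)
                    + ", local " + toLocal(doc));
        }
        // Local doc 22477 in the segment that starts at 84900.
        System.out.println(toGlobal(3, 22477));
    }
}
```

Under the same assumption, local doc 22477 in the segment starting at 84900 corresponds to top-level doc 107377, which may explain the unexpected ID from the first message: a local ID in a later segment happened to equal one of the wanted top-level IDs.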

Best Regards





--
View this message in context: http://lucene.472066.n3.nabble.com/pruning-package-question-about-termpositions-skipTo-tp4001160p4002656.html