Posted to java-user@lucene.apache.org by Andrzej Bialecki <ab...@getopt.org> on 2012/04/02 19:02:55 UTC

Re: delete entries from posting list Lucene 4.0

On 29/03/2012 11:14, Andrzej Bialecki wrote:

> The problem in our implementation is that we use a within-document term
> frequency (the number of occurrences of t in the current document) and
> not a collection-wide term frequency... so, it looks to me that the fix
> would be to first fully traverse the doc enumeration and calculate the
> total number of term occurrences in all documents (e.g. in
> RIDFTermPruningPolicy.initPositionsTerm(..) ), and use this value in the
> formula in place of termPositions.freq().
>

This is the fix that I implemented. It is now committed to branch_3x and
will be included in release 3.6.
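
For readers of the archive, here is a minimal, self-contained sketch of the
idea in the quoted paragraph: walk the whole document enumeration for a term
once and sum the per-document frequencies, so that the pruning formula can use
a collection-wide count instead of the current document's freq(). It is not the
committed code. DocFreqEnum, ArrayDocFreqEnum and totalTermFreq are
hypothetical names that only imitate the next()/freq() shape of a per-term
document enumeration, and the frequencies are made-up sample values.

```java
// Sketch only: all names here are hypothetical and not part of Lucene.
final class CollectionTermFreq {

    /** Hypothetical stand-in for a per-term document enumeration. */
    interface DocFreqEnum {
        /** Advances to the next document containing the term; false when exhausted. */
        boolean next();

        /** Within-document frequency of the term in the current document. */
        int freq();
    }

    /** Array-backed enumeration, used here only to make the sketch runnable. */
    static final class ArrayDocFreqEnum implements DocFreqEnum {
        private final int[] freqs;
        private int pos = -1;

        ArrayDocFreqEnum(int[] freqs) {
            this.freqs = freqs.clone();
        }

        @Override
        public boolean next() {
            if (pos < freqs.length) {
                pos++;
            }
            return pos < freqs.length;
        }

        @Override
        public int freq() {
            if (pos < 0 || pos >= freqs.length) {
                throw new IllegalStateException("not positioned on a document");
            }
            return freqs[pos];
        }
    }

    /**
     * Fully traverses the enumeration and returns the total number of
     * occurrences of the term across all documents.
     */
    static long totalTermFreq(DocFreqEnum docs) {
        long total = 0;
        while (docs.next()) {
            total += docs.freq();
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed sample data: the term occurs 3, 1 and 2 times in three documents.
        int[] perDocFreqs = {3, 1, 2};
        long total = totalTermFreq(new ArrayDocFreqEnum(perDocFreqs));
        System.out.println("collection-wide term frequency: " + total);
    }
}
```

The total has to be computed in a separate pass before the per-document
pruning decisions are made, because every one of those decisions depends on
the same collection-wide value.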

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: delete entries from posting list Lucene 4.0

Posted by "Zeynep P." <zp...@yahoo.com>.
Hi,

Thanks for the fix. 

I also wonder whether you know of any free collections for testing pruning
approaches. Almost all the papers use TREC collections, which I don't have.
For now, I use the Reuters-21578 collection and Carmel's extension of
Kendall's tau to measure similarity, but I need a collection with relevance
judgements.

Thanks in advance,
Best Regards
ZP

--
View this message in context: http://lucene.472066.n3.nabble.com/delete-entries-from-posting-list-Lucene-4-0-tp3838649p3933206.html
