You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Andrzej Bialecki <ab...@getopt.org> on 2012/04/02 19:02:55 UTC
Re: delete entries from posting list Lucene 4.0
On 29/03/2012 11:14, Andrzej Bialecki wrote:
> The problem in our implementation is that we use a within-document term
> frequency (the number of occurrences of t in the current document) and
> not a collection-wide term frequency... so, it looks to me that the fix
> would be to first fully traverse the doc enumeration and calculate the
> total number of term occurrences in all documents (e.g. in
> RIDFTermPruningPolicy.initPositionsTerm(..) ), and use this value in the
> formula in place of termPositions.freq().
>
This is the fix that I implemented, it's now committed to branch_3x and
will be included in release 3.6.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: delete entries from posting list Lucene 4.0
Posted by "Zeynep P." <zp...@yahoo.com>.
Hi,
Thanks for the fix.
I also wonder if you know any collection (free ones) to test pruning
approaches. Almost all the papers use TREC collections which I don't have!!
For now, I use Reuters21578 collection and Carmel's Kendall's tau extension
to measure similarity. But I need a collection with relevance judgements.
Thanks in advance,
Best Regards
ZP
--
View this message in context: http://lucene.472066.n3.nabble.com/delete-entries-from-posting-list-Lucene-4-0-tp3838649p3933206.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org