Posted to general@lucene.apache.org by khartnjava <fo...@gmail.com> on 2011/01/09 12:30:17 UTC

How to remove duplicates from a Lucene index?

Hello.
I have an index with 3 fields:
path_to_file - stored, not analyzed - unique path to the file
file_content - stored, not analyzed - the file's content
file_content_int - analyzed - the file's content
How can I find and delete documents with duplicate values in the file_content field?
I found http://open.vinayras.com/lucene_duplicate_remover
but it doesn't work with Lucene 3.x.
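
To make the goal concrete, the desired result can be sketched in plain Java. This is only an illustration of what "duplicate" means here, not a Lucene solution: it uses no Lucene API, and the class DuplicateFinder, the method findDuplicatePaths, and the in-memory map are hypothetical stand-ins for the stored path_to_file and file_content values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of Lucene or of the tool linked above.
public class DuplicateFinder {

    /**
     * Takes a map of path_to_file -> file_content and returns the paths whose
     * content was already seen under an earlier path (in the map's iteration order).
     * The first path seen for each distinct content is kept.
     */
    public static List<String> findDuplicatePaths(Map<String, String> contentByPath) {
        // Remembers the first path encountered for each distinct content.
        Map<String, String> firstPathByContent = new HashMap<>();
        List<String> duplicates = new ArrayList<>();
        for (Map.Entry<String, String> entry : contentByPath.entrySet()) {
            String earlierPath =
                firstPathByContent.putIfAbsent(entry.getValue(), entry.getKey());
            if (earlierPath != null) {
                // Same content already belongs to an earlier path: this one is a duplicate.
                duplicates.add(entry.getKey());
            }
        }
        return duplicates;
    }
}
```

The returned paths are the documents that would need to be deleted from the index; how to do that with Lucene 3.x is the open question.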
Sorry for my English.
-- 
View this message in context: http://lucene.472066.n3.nabble.com/How-to-remove-dublicates-from-Lucene-index-tp2220756p2220756.html