Posted to java-user@lucene.apache.org by "K. M. McCormick" <ky...@gmail.com> on 2009/10/26 20:31:38 UTC

Merging Indexes

Hello:

Currently, I have a large index being built in sections. The indexing
program has multiple threads which it uses to optimize time; each thread
makes its own, separate index to avoid threads fighting over resources. At
the end of the program, the indexes are merged into a single index.

I've read conflicting claims about whether merging deletes the source
indexes. It looks like they're not being deleted, since when I check on the
sizes of each index, I see the following:

SIZE    INDEX NAME/DIRECTORY
361M    term2gm/WorkerThread0
357M    term2gm/WorkerThread1
449M    term2gm/WorkerThread2
404M    term2gm/WorkerThread3
359M    term2gm/WorkerThread4
274M    term2gm/WorkerThread5
428M    term2gm/WorkerThread6
306M    term2gm/WorkerThread7
317M    term2gm/WorkerThread8
309M    term2gm/WorkerThread9

2.8G    term2gm/finalindex_term2gm

The final index is the merged index at the end of the program. Is it safe
for me to delete the WorkerThread# indexes, or do I need to keep them so
that my final, merged index is maintained?

Also, if I merge indexes later, will the original indexes be kept or
deleted after the merge? Like I said, I've read it both ways, and I want to
be sure of how it works with Lucene 2.4.1 / Lucene 2.9.

Thanks!
Kylie

-- 
The Circle of the Dragon -- dragon history and mystery
http://www.blackdrago.com/index.html

Online Resume and Portfolio
http://www.kyliemccormick.com/

"Light, seeking light, doth the light of light beguile!"
-- William Shakespeare's Love's Labor's Lost

Re: Merging Indexes

Posted by Chris Lu <ch...@gmail.com>.
Pretty sure you can delete the small indexes after the merge.

BTW: How long does your indexing and merging take respectively?
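
A minimal sketch of the cleanup step, assuming the merge has finished and the
final index opens and searches correctly. As far as I know,
IndexWriter.addIndexesNoOptimize(Directory[]) in 2.4.1 / 2.9 copies the
segments into the target directory and leaves the source directories alone,
so the merged index should not depend on them, and later merges should leave
their sources in place as well. The class and method names below are
hypothetical; this is plain Java (java.nio.file), not Lucene API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: removes the per-thread index
// directories once the merged index has been written and checked.
class WorkerIndexCleanup {

    // Deletes every directory directly under `parent` whose name starts with
    // `prefix` (e.g. "WorkerThread") and returns the paths that were removed.
    static List<Path> deleteWorkerIndexes(Path parent, String prefix) throws IOException {
        // Collect the matches first so nothing is deleted while the
        // directory listing is still open.
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(parent)) {
            for (Path entry : entries) {
                boolean nameMatches = entry.getFileName().toString().startsWith(prefix);
                if (nameMatches && Files.isDirectory(entry)) {
                    matches.add(entry);
                }
            }
        }
        for (Path dir : matches) {
            deleteRecursively(dir);
        }
        return matches;
    }

    // Deletes a directory tree bottom-up: files first, then each directory
    // after its contents are gone.
    static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

With the layout from the original message, a call such as
WorkerIndexCleanup.deleteWorkerIndexes(Paths.get("term2gm"), "WorkerThread")
would remove WorkerThread0 through WorkerThread9 and leave
finalindex_term2gm untouched. Close any IndexReader or IndexWriter still open
on the worker directories before running it.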

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding!


